Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning
Deep Learning has recently become hugely popular in machine learning,
providing significant improvements in classification accuracy in the presence
of highly-structured and large databases.
Researchers have also considered privacy implications of deep learning.
Models are typically trained in a centralized manner with all the data being
processed by the same training algorithm. If the data is a collection of users'
private data, including habits, personal pictures, geographical positions,
interests, and more, the centralized server will have access to sensitive
information that could potentially be mishandled. To tackle this problem,
collaborative deep learning models have recently been proposed where parties
locally train their deep learning structures and only share a subset of the
parameters in the attempt to keep their respective training sets private.
Parameters can also be obfuscated via differential privacy (DP) to make
information extraction even more challenging, as proposed by Shokri and
Shmatikov at CCS'15.
Unfortunately, we show that any privacy-preserving collaborative deep
learning is susceptible to a powerful attack that we devise in this paper. In
particular, we show that a distributed, federated, or decentralized deep
learning approach is fundamentally broken and does not protect the training
sets of honest participants. The attack we developed exploits the real-time
nature of the learning process that allows the adversary to train a Generative
Adversarial Network (GAN) that generates prototypical samples of the targeted
training set that was meant to be private (the samples generated by the GAN are
intended to come from the same distribution as the training data).
Interestingly, we show that record-level DP applied to the shared parameters of
the model, as suggested in previous work, is ineffective (i.e., record-level DP
is not designed to address our attack).
Comment: ACM CCS'17, 16 pages, 18 figures.
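The core leakage mechanism the abstract relies on, that parameter updates computed on private data necessarily encode that data, can be illustrated with a toy example. This is not the paper's GAN attack; it is a minimal sketch, under assumed names, showing that an adversary observing a shared gradient of a one-parameter model can reconstruct a private statistic exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Private data held by an honest participant (never shared directly).
private_data = rng.normal(loc=5.0, scale=1.0, size=1000)

# Toy model: a single parameter w trained to match the data mean,
# loss L(w) = mean((w - x_i)^2), so dL/dw = 2 * (w - mean(x)).
w = 0.0
lr = 0.25
for _ in range(50):
    grad = 2.0 * (w - private_data.mean())  # the update shared with the server
    # An adversary observing (w, grad) can solve for the private mean:
    leaked_mean = w - grad / 2.0
    w -= lr * grad

print(round(leaked_mean, 2))  # recovers the mean of the private data
```

The paper's attack generalizes this intuition: instead of inverting a closed-form gradient, the adversary trains a GAN against the evolving shared model to synthesize samples resembling the victim's training set.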
Accelerated Parallel Non-conjugate Sampling for Bayesian Non-parametric Models
Inference of latent feature models in the Bayesian nonparametric setting is
generally difficult, especially in high dimensional settings, because it
usually requires proposing features from some prior distribution. In special
cases, where the integration is tractable, we could sample new feature
assignments according to a predictive likelihood. However, this still may not
be efficient in high dimensions. We present a novel method to accelerate the
mixing of latent variable model inference by proposing feature locations from
the data, as opposed to the prior. First, we introduce our accelerated feature
proposal mechanism that we will show is a valid Bayesian inference algorithm
and next we propose an approximate inference strategy to perform accelerated
inference in parallel. This sampling method promotes proper mixing of
the Markov chain Monte Carlo sampler, is computationally attractive, and is
theoretically guaranteed to converge to the posterior distribution as its
limiting distribution.
Comment: Previously known as "Accelerated Inference for Latent Variable Models".
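The advantage of proposing feature locations from the data rather than from the prior can be sketched with an independence Metropolis-Hastings sampler on a toy Gaussian feature-location problem. All names and the kernel-density proposal below are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy problem: a feature location mu with data clustered far from the prior.
data = rng.normal(loc=8.0, scale=1.0, size=200)
log_post = lambda mu: -0.5 * ((data - mu) ** 2).sum() - 0.5 * mu**2 / 100.0

def data_logpdf(m):
    # Log-density of a kernel proposal built on the data (up to a constant).
    z = -0.5 * ((m - data) / 0.2) ** 2
    zmax = z.max()
    return zmax + np.log(np.exp(z - zmax).mean())

def mh(proposal_draw, proposal_logpdf, n=2000):
    """Independence Metropolis-Hastings; returns the acceptance rate."""
    mu, accepts = 0.0, 0
    for _ in range(n):
        cand = proposal_draw()
        log_a = (log_post(cand) - log_post(mu)
                 + proposal_logpdf(mu) - proposal_logpdf(cand))
        if np.log(rng.uniform()) < log_a:
            mu, accepts = cand, accepts + 1
    return accepts / n

# Proposing from the prior N(0, 10^2) rarely lands near the posterior mode...
prior_rate = mh(lambda: rng.normal(0.0, 10.0), lambda m: -0.5 * m**2 / 100.0)
# ...while proposing near observed data points mixes much better.
data_rate = mh(lambda: rng.choice(data) + rng.normal(0.0, 0.2), data_logpdf)
print(prior_rate < data_rate)
```

With a sharply concentrated posterior, prior proposals are almost always rejected, while data-driven proposals land near the mode, which is the acceleration the abstract describes.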
Optimization of Annealed Importance Sampling Hyperparameters
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate
the intractable marginal likelihood of deep generative models. Although AIS is
guaranteed to provide an unbiased estimate for any set of hyperparameters,
common implementations rely on simple heuristics, such as using the geometric
average of the initial and target distributions as the bridging distributions,
which hurt estimation performance when the computation budget is limited. In
order to reduce the number of sampling iterations, we present a parametric AIS
process with flexible intermediary distributions defined by a residual density
with respect to the geometric mean path. Our method allows parameter sharing
between annealing distributions, the use of a fixed linear schedule for
discretization, and amortization of hyperparameter selection in latent variable
models. We assess the performance of Optimized-Path AIS for marginal likelihood
estimation of deep generative models and compare it to more computationally
intensive AIS.
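The geometric-average heuristic this abstract builds on can be made concrete with a minimal AIS run between two tractable densities. This is a sketch of vanilla AIS on a toy problem (distributions and tuning constants are my assumptions), not the paper's parametric method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Unnormalized log-densities: initial N(0,1) and target N(2,1).
log_f0 = lambda x: -0.5 * x**2
log_f1 = lambda x: -0.5 * (x - 2.0) ** 2

def log_fb(x, b):
    # Geometric-average bridging distribution between initial and target.
    return (1.0 - b) * log_f0(x) + b * log_f1(x)

n_particles, n_steps = 4000, 100
betas = np.linspace(0.0, 1.0, n_steps + 1)  # simple linear schedule

x = rng.normal(size=n_particles)  # exact samples from the initial distribution
log_w = np.zeros(n_particles)

for b_prev, b in zip(betas[:-1], betas[1:]):
    log_w += log_fb(x, b) - log_fb(x, b_prev)  # importance-weight increment
    # One Metropolis-Hastings move leaving the current bridging dist invariant.
    prop = x + 0.5 * rng.normal(size=n_particles)
    accept = np.log(rng.uniform(size=n_particles)) < log_fb(prop, b) - log_fb(x, b)
    x = np.where(accept, prop, x)

# Both unnormalized densities integrate to sqrt(2*pi), so Z1/Z0 should be 1.
m = log_w.max()
ratio = np.exp(m) * np.mean(np.exp(log_w - m))
print(round(ratio, 2))
```

The paper's contribution replaces the fixed geometric path `log_fb` with a learned residual density on top of it, so fewer annealing steps are needed for the same estimation quality.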
Infinite Factorial Finite State Machine for Blind Multiuser Channel Estimation
New communication standards need to deal with machine-to-machine
communications, in which users may start or stop transmitting at any time in an
asynchronous manner. Thus, the number of users is an unknown and time-varying
parameter that needs to be accurately estimated in order to properly recover
the symbols transmitted by all users in the system. In this paper, we address
the problem of joint channel parameter and data estimation in a multiuser
communication channel in which the number of transmitters is not known. For
that purpose, we develop the infinite factorial finite state machine model, a
Bayesian nonparametric model based on the Markov Indian buffet that allows for
an unbounded number of transmitters with arbitrary channel length. We propose
an inference algorithm that makes use of slice sampling and particle Gibbs with
ancestor sampling. Our approach is fully blind as it does not require a prior
channel estimation step, prior knowledge of the number of transmitters, or any
signaling information. Our experimental results, loosely based on the LTE
random access channel, show that the proposed approach can effectively recover
the data-generating process for a wide range of scenarios, with varying number
of transmitters, number of receivers, constellation order, channel length, and
signal-to-noise ratio.
Comment: 15 pages, 15 figures.
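Of the two inference components the abstract names, slice sampling is the simpler to illustrate. Below is a generic univariate slice sampler with stepping-out and shrinkage (a standard construction, not the paper's full inference algorithm; the test density is an assumption):

```python
import numpy as np

rng = np.random.default_rng(2)

def slice_sample(log_f, x0, n, width=1.0):
    """Univariate slice sampler with stepping-out and shrinkage."""
    xs, x = [], x0
    for _ in range(n):
        log_y = log_f(x) + np.log(rng.uniform())  # auxiliary slice height
        # Step out to bracket the slice {z : log_f(z) > log_y}.
        left = x - width * rng.uniform()
        right = left + width
        while log_f(left) > log_y:
            left -= width
        while log_f(right) > log_y:
            right += width
        # Sample uniformly within the bracket, shrinking on rejection.
        while True:
            z = rng.uniform(left, right)
            if log_f(z) > log_y:
                x = z
                break
            if z < x:
                left = z
            else:
                right = z
        xs.append(x)
    return np.array(xs)

samples = slice_sample(lambda z: -0.5 * (z - 3.0) ** 2, x0=0.0, n=5000)
print(round(samples.mean(), 1))  # close to the target mean of 3.0
```

In the paper, slice sampling serves to truncate the unbounded (nonparametric) number of transmitters so that particle Gibbs with ancestor sampling can run over a finite state space.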
Economic Complexity Unfolded: Interpretable Model for the Productive Structure of Economies
Economic complexity reflects the amount of knowledge that is embedded in the
productive structure of an economy. It resides on the premise of hidden
capabilities - fundamental endowments underlying the productive structure. In
general, measuring the capabilities behind economic complexity directly is
difficult, and indirect measures have been suggested which exploit the fact
that the presence of the capabilities is expressed in a country's mix of
products. We complement these studies by introducing a probabilistic framework
which leverages Bayesian non-parametric techniques to extract the dominant
features behind the comparative advantage in exported products. Based on
economic evidence and trade data, we place a restricted Indian Buffet Process
on the distribution of countries' capability endowment, appealing to a culinary
metaphor to model the process of capability acquisition. The approach comes
with a unique level of interpretability, as it produces a concise and
economically plausible description of the instantiated capabilities.
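The culinary metaphor behind the Indian Buffet Process can be shown directly: each "customer" (country) takes existing "dishes" (capabilities) in proportion to their popularity, then tries a Poisson number of new ones. A sketch of the standard (unrestricted) IBP prior, with hypothetical sizes, not the paper's restricted variant:

```python
import numpy as np

rng = np.random.default_rng(5)

def ibp(n_customers, alpha):
    """Sample a binary feature matrix Z from the Indian Buffet Process prior."""
    dish_counts = []  # how many customers have taken each dish so far
    rows = []
    for i in range(1, n_customers + 1):
        # Take existing dish k with probability (popularity m_k) / i.
        row = [rng.uniform() < c / i for c in dish_counts]
        for k, taken in enumerate(row):
            dish_counts[k] += taken
        n_new = rng.poisson(alpha / i)  # customer i also tries new dishes
        row += [True] * n_new
        dish_counts += [1] * n_new
        rows.append(row)
    Z = np.zeros((n_customers, len(dish_counts)), dtype=bool)
    for i, row in enumerate(rows):
        Z[i, : len(row)] = row
    return Z

Z = ibp(n_customers=50, alpha=3.0)
print(Z.shape)  # 50 countries by a data-driven number of capabilities
```

The number of columns is not fixed in advance, which is what lets the model infer how many capabilities the trade data supports rather than imposing that number.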
Adaptive Annealed Importance Sampling with Constant Rate Progress
Annealed Importance Sampling (AIS) synthesizes weighted samples from an
intractable distribution given its unnormalized density function. This
algorithm relies on a sequence of interpolating distributions bridging the
target to an initial tractable distribution such as the well-known geometric
mean path of unnormalized distributions which is assumed to be suboptimal in
general. In this paper, we prove that the geometric annealing corresponds to
the distribution path that minimizes the KL divergence between the current
particle distribution and the desired target when the feasible change in the
particle distribution is constrained. Following this observation, we derive the
constant rate discretization schedule for this annealing sequence, which
adjusts the schedule to the difficulty of moving samples between the initial
and the target distributions. We further extend our results to α-divergences
and present the respective dynamics of annealing sequences, based on which we
propose the Constant Rate AIS (CR-AIS) algorithm and its efficient
implementation for α-divergences. We empirically show that CR-AIS
performs well on multiple benchmark distributions while avoiding the
computationally expensive tuning loop of existing adaptive AIS methods.
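For reference, the geometric annealing path analyzed in this abstract is conventionally written as (the symbols $\pi_\beta$, $p_0$, $p_1$ are my notation, not necessarily the paper's):

```latex
\pi_\beta(x) \;\propto\; p_0(x)^{1-\beta}\, p_1(x)^{\beta}, \qquad \beta \in [0,1],
```

so that $\beta = 0$ recovers the initial distribution $p_0$ and $\beta = 1$ the target $p_1$. The constant rate schedule then chooses the grid of $\beta$ values so that successive bridging distributions are, in the paper's divergence sense, equally difficult to move between.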
Simulation-based inference using surjective sequential neural likelihood estimation
We present Surjective Sequential Neural Likelihood (SSNL) estimation, a novel
method for simulation-based inference in models where the evaluation of the
likelihood function is not tractable and only a simulator that can generate
synthetic data is available. SSNL fits a dimensionality-reducing surjective
normalizing flow model and uses it as a surrogate likelihood function which
allows for conventional Bayesian inference using either Markov chain Monte
Carlo methods or variational inference. By embedding the data in a
low-dimensional space, SSNL solves several issues previous likelihood-based
methods had when applied to high-dimensional data sets that, for instance,
contain non-informative data dimensions or lie along a lower-dimensional
manifold. We evaluate SSNL on a wide variety of experiments and show that it
generally outperforms contemporary methods used in simulation-based inference,
for instance, on a challenging real-world example from astrophysics which
models the magnetic field strength of the sun using a solar dynamo model.
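The surrogate-likelihood idea underlying SSNL can be sketched without a normalizing flow: simulate at each candidate parameter, fit a cheap density to the simulator outputs, and use it in place of the intractable likelihood. Below a Gaussian fit on a parameter grid stands in for SSNL's learned surjective flow (the simulator and all constants are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical simulator: we can sample from it but not evaluate its likelihood.
def simulator(theta, n):
    return theta + rng.normal(scale=1.0, size=n)

x_obs = 1.5                           # the observed datum
thetas = np.linspace(-4.0, 4.0, 201)  # parameter grid under a flat prior

# Fit a Gaussian surrogate likelihood at each grid point from simulations,
# a crude stand-in for SSNL's dimensionality-reducing flow surrogate.
log_lik = np.empty_like(thetas)
for i, th in enumerate(thetas):
    sims = simulator(th, 500)
    mu, sd = sims.mean(), sims.std()
    log_lik[i] = -0.5 * ((x_obs - mu) / sd) ** 2 - np.log(sd)

# Normalize on the grid to obtain an approximate posterior.
post = np.exp(log_lik - log_lik.max())
post /= post.sum()
post_mean = float((thetas * post).sum())
print(round(post_mean, 1))
```

SSNL's advantage over this crude stand-in is twofold: the flow is far more expressive than a per-point Gaussian, and its surjective (dimension-reducing) layers discard non-informative data dimensions before density estimation.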
Experiences of Pedagogical and Technological Innovation in Implementing the Diplomado en Programación Pedagógica para la Docencia Universitaria por Competencias through the Moodle Platform at UAEM
As part of the instrumentation projects for carrying out the Curricular Innovation Model at UAEMéx, implemented in 2003, the Academic Body in Education and Teaching of Geography took on, starting in 2010, the proposal of a teacher training program called "Diplomado en Programación Pedagógica para la Docencia Universitaria por Competencias". The proposal was supported by the Dirección de Desarrollo del Personal Académico (DIDEPA) through the Moodle platform and has been delivered in three cohorts from 2010 to 2013.